Small Is Not Always Beautiful
Peer-to-peer content distribution systems have been enjoying great
popularity, and are now gaining momentum as a means of disseminating video
streams over the Internet. In many of these protocols, including the popular
BitTorrent, content is split into mostly fixed-size pieces, allowing a client
to download data from many peers simultaneously. This makes piece size
potentially critical for performance. However, previous research efforts have
largely overlooked this parameter, opting to focus on others instead. This
paper presents the results of real experiments with varying piece sizes on a
controlled BitTorrent testbed. We demonstrate that this parameter is indeed
critical, as it determines the degree of parallelism in the system, and we
investigate optimal piece sizes for distributing small and large content. We
also pinpoint a related design trade-off, and explain how BitTorrent's choice
of dividing pieces into subpieces attempts to address it.
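As a rough illustration of why piece size governs the degree of parallelism, the sketch below counts pieces and subpieces for a given content size. The function name `piece_layout` and the 16 KiB subpiece size are assumptions made for illustration (16 KiB is a commonly used request size in BitTorrent clients); neither is taken from the paper.

```python
SUBPIECE_SIZE = 16 * 1024  # assumed subpiece size in bytes; not a value from the paper


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def piece_layout(content_size, piece_size, subpiece_size=SUBPIECE_SIZE):
    """Return (number of pieces, subpieces per full piece) for the given sizes in bytes."""
    if content_size <= 0 or piece_size <= 0 or subpiece_size <= 0:
        raise ValueError("all sizes must be positive")
    n_pieces = ceil_div(content_size, piece_size)
    subpieces_per_piece = ceil_div(piece_size, subpiece_size)
    return n_pieces, subpieces_per_piece


if __name__ == "__main__":
    content = 100 * 1024 * 1024  # hypothetical 100 MiB file
    for piece in (256 * 1024, 4 * 1024 * 1024):
        pieces, subs = piece_layout(content, piece)
        print(f"piece size {piece // 1024} KiB: {pieces} pieces, {subs} subpieces each")
```

Smaller pieces give more independent units that can be fetched from different peers at once, while larger pieces leave fewer units to trade; subpieces let a client pipeline requests within a single piece.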
Accelerating MCMC via Parallel Predictive Prefetching
We present a general framework for accelerating a large class of widely used
Markov chain Monte Carlo (MCMC) algorithms. Our approach exploits fast,
iterative approximations to the target density to speculatively evaluate many
potential future steps of the chain in parallel. The approach can accelerate
computation of the target distribution of a Bayesian inference problem, without
compromising exactness, by exploiting subsets of data. It takes advantage of
whatever parallel resources are available, but produces results exactly
equivalent to standard serial execution. In the initial burn-in phase of chain
evaluation, it achieves speedup over serial evaluation that is close to linear
in the number of available cores.
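The idea of speculative evaluation with serially equivalent results can be sketched with a much simpler scheme than the one the abstract describes. The sketch below prefetches only along the "all proposals rejected" branch of a random-walk Metropolis chain and does not use approximations to the target density. The target `log_target`, the window size `depth`, and all function names are assumptions for illustration.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def log_target(x):
    """Unnormalized log density of a standard normal (stand-in target)."""
    return -0.5 * x * x


def draw_randomness(n_steps, seed):
    """Pre-draw (proposal step, log uniform) pairs so both chains consume the same randomness."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, 1.0), math.log(1.0 - rng.random())) for _ in range(n_steps)]


def serial_chain(x0, noise):
    """Standard random-walk Metropolis, one density evaluation per step."""
    x, lp, chain = x0, log_target(x0), []
    for step, log_u in noise:
        prop = x + step
        lp_prop = log_target(prop)
        if log_u < lp_prop - lp:
            x, lp = prop, lp_prop
        chain.append(x)
    return chain


def prefetching_chain(x0, noise, depth=4):
    """Evaluate up to `depth` future proposals at once, assuming each one is rejected."""
    x, lp, chain, t = x0, log_target(x0), [], 0
    with ThreadPoolExecutor(max_workers=depth) as pool:
        while t < len(noise):
            window = noise[t:t + depth]
            # If every pending proposal is rejected, all of them start from the current x.
            proposals = [x + step for step, _ in window]
            log_ps = list(pool.map(log_target, proposals))
            for (_step, log_u), prop, lp_prop in zip(window, proposals, log_ps):
                t += 1
                if log_u < lp_prop - lp:
                    x, lp = prop, lp_prop
                    chain.append(x)
                    break  # remaining speculative evaluations are discarded
                chain.append(x)
    return chain


if __name__ == "__main__":
    noise = draw_randomness(1000, seed=0)
    print(serial_chain(0.0, noise) == prefetching_chain(0.0, noise))
```

Because a speculative evaluation is used only when the branch it assumed is actually taken, the chain matches serial execution exactly. Threads are used here only to show the structure; in CPython a pure-Python density would need processes or native code to gain real speedup.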
Cache craftiness for fast multicore key-value storage
We present Masstree, a fast key-value database designed for SMP machines. Masstree keeps all data in memory. Its main data structure is a trie-like concatenation of B+-trees, each of which handles a fixed-length slice of a variable-length key. This structure effectively handles arbitrary-length, possibly binary keys, including keys with long shared prefixes. B+-tree fanout was chosen to minimize total DRAM delay when descending the tree and prefetching each tree node. Lookups use optimistic concurrency control, a read-copy-update-like technique, and do not write shared data structures; updates lock only affected nodes. Logging and checkpointing provide consistency and durability. Though some of these ideas appear elsewhere, Masstree is the first to combine them. We discuss design variants and their consequences.
On a 16-core machine, with logging enabled and queries arriving over a network, Masstree executes more than six million simple queries per second. This performance is comparable to that of memcached, a non-persistent hash table server, and higher (often much higher) than that of VoltDB, MongoDB, and Redis. National Science Foundation (U.S.) (Award 0834415); National Science Foundation (U.S.) (Award 0915164); Quanta Computer (Firm)
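The layering of a variable-length key into fixed-length slices can be sketched as follows. This is a minimal single-threaded illustration, not Masstree's implementation: a Python dict stands in for each B+-tree, concurrency control and logging are omitted, and the 8-byte `SLICE_LEN` and the names `Layer`, `put`, and `get` are assumptions.

```python
SLICE_LEN = 8  # assumed slice length in bytes; the abstract only says "fixed-length"

_MISSING = object()  # marks a slice that has children but no value of its own


class Layer:
    """One trie layer. A dict stands in for the B+-tree of the real design."""

    def __init__(self):
        # key slice -> [value or _MISSING, child Layer or None]
        self.entries = {}


def put(root, key, value):
    """Store value under a bytes key, descending one layer per full slice."""
    layer = root
    while True:
        head, rest = key[:SLICE_LEN], key[SLICE_LEN:]
        entry = layer.entries.setdefault(head, [_MISSING, None])
        if not rest:
            entry[0] = value
            return
        if entry[1] is None:
            entry[1] = Layer()
        layer, key = entry[1], rest


def get(root, key, default=None):
    """Look up a bytes key; return default if it is absent."""
    layer = root
    while True:
        head, rest = key[:SLICE_LEN], key[SLICE_LEN:]
        entry = layer.entries.get(head)
        if entry is None:
            return default
        if not rest:
            return default if entry[0] is _MISSING else entry[0]
        if entry[1] is None:
            return default
        layer, key = entry[1], rest


if __name__ == "__main__":
    root = Layer()
    put(root, b"/users/alice/profile", 1)
    put(root, b"/users/alice/photos", 2)
    print(get(root, b"/users/alice/photos"), get(root, b"/users/bob"))
```

Keys that share a long prefix share the upper layers, so each comparison inside a layer touches at most one slice rather than the whole key.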
Prolac--a language for protocol compilation
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (p. 81-83). By Eddie Kohler. M.S.
Clustering and Sharing Incentives in BitTorrent Systems
Peer-to-peer protocols play an increasingly instrumental role in Internet content distribution. Consequently, it is important to gain a full understanding of how these protocols behave in practice and how their parameters impact overall performance. We present the first experimental investigation of the peer selection strategy of the popular BitTorrent protocol in an instrumented private torrent. By observing the decisions of more than 40 nodes, we validate three BitTorrent properties that, though widely believed to hold, have not been demonstrated experimentally. These include the clustering of similar-bandwidth peers, the effectiveness of BitTorrent's sharing incentives, and the peers' high average upload utilization. In addition, our results show that BitTorrent's new choking algorithm in seed state provides uniform service to all peers, and that an underprovisioned initial seed leads to the absence of peer clustering and less effective sharing incentives. Based on our observations, we provide guidelines for seed provisioning by content providers, and discuss a tracker protocol extension that addresses an identified limitation of the protocol.
Phase Reconciliation for Contended In-Memory Transactions
Multicore main-memory database performance can collapse when many transactions contend on the same data. Contending transactions are executed serially—either by locks or by optimistic concurrency control aborts—in order to ensure that they have serializable effects. This leaves many cores idle and performance poor. We introduce a new concurrency control technique, phase reconciliation, that solves this problem for many important workloads. Doppel, our phase reconciliation database, repeatedly cycles through joined, split, and reconciliation phases. Joined phases use traditional concurrency control and allow any transaction to execute. When workload contention causes unnecessary serial execution, Doppel switches to a split phase. There, updates to contended items modify per-core state, and thus proceed in parallel on different cores. Not all transactions can execute in a split phase; for example, all modifications to a contended item must commute. A reconciliation phase merges these per-core states into the global store, producing a complete database ready for joined phase transactions. A key aspect of this design is determining which items to split, and which operations to allow on split items.
Phase reconciliation helps most when there are many updates to a few popular database records. Its throughput is up to 38x higher than conventional concurrency control protocols on microbenchmarks, and up to 3x higher on a larger application, at the cost of increased latency for some transactions. Engineering and Applied Science
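The split and reconciliation phases can be illustrated with a toy counter store. This is a single-threaded simulation under assumed names (`SplitCounterStore`, `begin_split`, `reconcile`), not Doppel's design: it supports only commutative additions and does not model transactions, phase-switch heuristics, or real per-core execution.

```python
from collections import Counter


class SplitCounterStore:
    """Toy store: additions to split keys go to per-core state until reconciled."""

    def __init__(self, n_cores):
        self.global_store = Counter()
        self.per_core = [Counter() for _ in range(n_cores)]
        self.split_keys = set()

    def begin_split(self, keys):
        """Enter a split phase for the given contended keys."""
        self.split_keys = set(keys)

    def add(self, core, key, delta):
        """Commutative update: split keys touch only this core's private state."""
        if key in self.split_keys:
            self.per_core[core][key] += delta
        else:
            self.global_store[key] += delta

    def read(self, key):
        """Reads of split keys do not commute with additions, so they must wait."""
        if key in self.split_keys:
            raise RuntimeError("reads of split keys must wait for a joined phase")
        return self.global_store[key]

    def reconcile(self):
        """Merge per-core state into the global store and return to a joined phase."""
        for local in self.per_core:
            self.global_store.update(local)  # Counter.update adds counts
            local.clear()
        self.split_keys = set()


if __name__ == "__main__":
    store = SplitCounterStore(n_cores=4)
    store.begin_split({"hot"})
    for core in range(4):
        store.add(core, "hot", 1)
    store.reconcile()
    print(store.read("hot"))
```

Because additions commute, the order in which per-core states are merged does not affect the final value, which is what allows the updates to proceed without coordination during the split phase.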
Toward Secure Services from Untrusted Developers
We present a secure service prototype built from untrusted, contributed code. The service manages private data for a variety of different users, and user programs frequently require access to other users' private data. However, aside from covert timing channels, no part of the service can corrupt private data or leak it between users or outside the system without permission from the data's owners. Instead, owners may choose to reveal their data in a controlled manner. This application model is demonstrated by Muenster, a job search website that protects both the integrity and secrecy of each user's data. In spite of running untrusted code, Muenster and other services can prevent overt leaks because the untrusted modules are constrained by the operating system to follow pre-specified security policies, which are nevertheless flexible enough for programmers to do useful work. We build Muenster atop Asbestos, a recently described operating system based on a form of decentralized information flow control.
Making information flow explicit in HiStar
HiStar is a new operating system designed to minimize the amount of code that must be trusted. HiStar provides strict information flow control, which allows users to specify precise data security policies without unduly limiting the structure of applications. HiStar's security features make it possible to implement a Unix-like environment with acceptable performance almost entirely in an untrusted user-level library. The system has no notion of superuser and no fully trusted code other than the kernel. HiStar's features permit several novel applications, including privacy-preserving, untrusted virus scanners and a dynamic Web server with only a few thousand lines of trusted code. National Science Foundation (U.S.) (Cybertrust Award CNS-0716806); National Science Foundation (U.S.) (Cybertrust/DARPA Grant CNS-0430425)
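The general flavor of information flow control can be shown with a deliberately simplified model in which data carries a set of secrecy tags. This is not the label model of HiStar or Asbestos, which is richer; the names `Labeled`, `can_flow`, and `declassify` and the tag strings are assumptions for illustration.

```python
class Labeled:
    """A value together with the secrecy tags that restrict where it may flow."""

    def __init__(self, value, tags):
        self.value = value
        self.tags = frozenset(tags)


def can_flow(source_tags, dest_tags):
    """Data may flow only to a destination that carries every tag of the source."""
    return frozenset(source_tags) <= frozenset(dest_tags)


def declassify(data, tag, owned_tags):
    """Remove a tag, which only a holder of ownership over that tag may do."""
    if tag not in owned_tags:
        raise PermissionError(f"no ownership of tag {tag!r}")
    return Labeled(data.value, data.tags - {tag})


if __name__ == "__main__":
    resume = Labeled("alice's resume", {"alice"})
    network_tags = set()  # an untainted output channel
    print(can_flow(resume.tags, network_tags))  # blocked while the tag is present
    released = declassify(resume, "alice", owned_tags={"alice"})
    print(can_flow(released.tags, network_tags))
```

In this model untrusted code can compute on tagged data freely, but the result stays tagged, so it cannot reach an untagged channel unless the tag's owner chooses to release it.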
- …